Segmentation of touching handwritten Japanese characters using the graph theory method
Abstract
Projection analysis methods have been widely used to segment Japanese character strings. However, when adjacent characters have overhanging strokes, or when a touching point does not coincide with a minimum of the projection histogram, these methods are prone to errors. Non-projection methods proposed for numerals or alphabetic characters, in turn, cannot simply be applied to Japanese characters because the character structure differs. This paper presents a new pre-segmentation method based on the oversegmentation strategy: touching patterns are represented as graphs, and touching strokes are regarded as elements of proper edge cutsets. The cutset matrix is calculated using graph-theoretic techniques; pruning rules are then applied to determine potential touching strokes, and the patterns are oversegmented. Simulations confirmed that the algorithm is also valid for touching patterns with overhanging strokes and for doubly connected patterns.
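The cutset idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the toy graph, the brute-force enumeration, the function names, and the pruning rule (`min_side`) are all assumptions made for illustration. Two "characters" are modelled as triangles of stroke junctions joined by two touching strokes, which makes the pattern doubly connected, so no single edge separates it.

```python
from itertools import combinations


def components(nodes, edges):
    """Return the connected components of an undirected (multi)graph."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            for nxt in adj[stack.pop()]:
                if nxt not in comp:
                    comp.add(nxt)
                    stack.append(nxt)
        seen |= comp
        comps.append(comp)
    return comps


def minimal_cutsets(nodes, edges, max_size=2):
    """Brute-force the minimal edge cutsets with at most max_size edges.

    Assumes the input graph is connected. Sizes are tried in increasing
    order, so any candidate containing a smaller cutset is skipped.
    """
    found = []
    for k in range(1, max_size + 1):
        for combo in combinations(range(len(edges)), k):
            chosen = frozenset(combo)
            if any(c < chosen for c in found):
                continue  # not minimal: a proper subset already disconnects
            rest = [e for i, e in enumerate(edges) if i not in chosen]
            if len(components(nodes, rest)) > 1:
                found.append(chosen)
    return found


def cutset_matrix(cutsets, n_edges):
    """Rows are cutsets, columns are edges; 1 marks membership."""
    return [[int(j in c) for j in range(n_edges)] for c in cutsets]


def prune(nodes, edges, cutsets, min_side=3):
    """Hypothetical pruning rule (an assumption, not the paper's rules):
    keep a cutset only if every resulting piece has at least min_side nodes."""
    kept = []
    for c in cutsets:
        rest = [e for i, e in enumerate(edges) if i not in c]
        if all(len(p) >= min_side for p in components(nodes, rest)):
            kept.append(c)
    return kept


# Toy pattern: triangle a1-a2-a3 and triangle b1-b2-b3, touching twice.
nodes = ["a1", "a2", "a3", "b1", "b2", "b3"]
edges = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),  # strokes of character A
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),  # strokes of character B
    ("a2", "b1"), ("a3", "b2"),                # the two touching strokes
]

cuts = minimal_cutsets(nodes, edges)
matrix = cutset_matrix(cuts, len(edges))
candidates = prune(nodes, edges, cuts)

for row in matrix:
    print(row)
print([sorted(c) for c in candidates])
```

On this toy graph the enumeration finds three two-edge cutsets. Two of them only isolate a single junction, and the size-based rule discards them, leaving the pair of touching strokes (edges 6 and 7) as the segmentation candidate. Brute-force enumeration grows combinatorially with the number of edges, so it is only meant to make the cutset matrix concrete on small examples.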
Similar resources
Segmentation of Isolated and Touching Characters in Offline Handwritten Gurmukhi Script Recognition
Segmentation of a word into characters is one of the important challenges in optical character recognition. This is even more challenging when we segment characters in an offline handwritten document. Touching characters make this problem more complex. In this paper, we have applied a water reservoir based technique for identification and segmentation of touching characters in handwritten Gurmukh...
A Novel Approach of Segmenting Touching and Kerned Characters
Character segmentation is a critical step of an OCR system. In this paper we discuss segmentation approaches for touching and kerned characters. A non-linear segmentation path-based algorithm for segmenting touching and kerned characters is put forward. First, touching and kerned characters are extracted and separated from other characters by using character projections and recognition results. The...
Adding Feedback to Improve Segmentation and Recognition of
WinBank is a system which performs automated reading of handwritten documents, particularly of strings of handwritten numerals on bank checks. A commonly-used strategy for reading handwritten strings is a combination of segmentation and recognition. In the WinBank program, segmentation involves both the separation of touching characters and the merging of character fragments to other pieces. An...
A Script Independent Technique for Extraction of Characters from Handwritten Word Images
A script-independent technique for segmenting characters from word images is reported here. Word-to-character segmentation is an important preprocessing step of the optical character recognition process. In handwritten text, however, the presence of touching characters decreases the accuracy of segmenting the characters from the word. In this paper, segmentation of handw...
A Survey on Word Segmentation Method for Handwritten Documents
One of the most important and challenging tasks in a handwritten recognition pipeline is the segmentation of handwritten document images into text lines and words. Several problems inherent in handwritten documents such as the difference in the skew angle between text lines or along the same text line, the existence of adjacent text lines or words touching, the existence of characters with diff...